The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
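Patch-based training, the most common workaround reported above (69%), amounts to tiling each oversized image into fixed-size crops that fit into memory. A minimal, illustrative sketch of the tiling step is shown below; the function name and the border-clamping behaviour are assumptions for illustration, not taken from the survey or any specific participant's code.

```python
def patch_coords(height, width, patch, stride):
    """Yield (row, col) top-left corners of patches tiling an image.

    The last row/column of patches is clamped to the image border, the usual
    trick when the stride does not divide the image size evenly.
    """
    if patch > height or patch > width:
        raise ValueError("patch larger than image")
    rows = list(range(0, height - patch + 1, stride))
    cols = list(range(0, width - patch + 1, stride))
    # ensure the borders are covered even for non-divisible sizes
    if rows[-1] != height - patch:
        rows.append(height - patch)
    if cols[-1] != width - patch:
        cols.append(width - patch)
    for r in rows:
        for c in cols:
            yield r, c
```

During training, crops at these coordinates are fed to the network; at inference, predictions on overlapping patches are typically stitched back together, e.g. by averaging in the overlap regions.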
We introduce the Probabilistic Worldbuilding Model (PWM), a new fully-symbolic Bayesian model of semantic parsing and reasoning, as a first step in a research program toward more domain- and task-general NLU and AI. Humans create internal mental models of what they observe, which greatly aids their ability to understand and reason about a large variety of problems. In PWM, the meanings of sentences, acquired facts about the world, and intermediate reasoning steps are all expressed in a human-readable form, with interpretability as a design goal. PWM is Bayesian and is designed to generalize to new domains and new tasks. We derive and implement an inference algorithm that reads sentences by parsing and abducing updates to its latent world model that capture their semantics, and evaluate it on two out-of-domain question-answering datasets: (1) ProofWriter and (2) a new dataset we call FictionalGeoQA, designed to be more representative of real language while remaining simple enough to focus on evaluating reasoning ability and to be robust against heuristics. Our method outperforms the baselines on both, demonstrating its value as a proof of concept.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
To achieve good performance and generalisability, medical image segmentation models should be trained on large datasets with sufficient variability. Owing to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the variability of the data distribution and improve model generalisability. Recent work has explored deep generative models for image synthesis, as such an approach would enable the generation of an effectively infinite amount of varied data, addressing both the generalisability and the data access problems. However, many of the proposed solutions limit the user's control over what is generated. In this work, we propose brainSPADE, a model which combines a synthetic diffusion-based label generator with a semantic image generator. Our model can produce fully synthetic brain labels on demand, with or without a pathology of interest, and then generate a corresponding MRI image in an arbitrary guided style. Experiments show that brainSPADE synthetic data can be used to train segmentation models whose performance is comparable to that of models trained on real data.
In this work we present affinity-VAE: a framework for automatic clustering and classification of objects in multidimensional image data based on their similarity. The method extends the concept of $\beta$-VAEs with an informed similarity-based loss component driven by an affinity matrix. In contrast to a standard $\beta$-VAE, the affinity-VAE is able to create rotationally invariant, morphologically homogeneous clusters in the latent representation, with improved cluster separation. We explore the extent of latent disentanglement and the continuity of the latent spaces on both 2D and 3D image data, including simulated biological electron cryo-tomography (cryo-ET) volumes as an example of a scientific application.
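The $\beta$-VAE objective that affinity-VAE builds on weighs the KL divergence between the approximate posterior and a standard normal prior by a factor $\beta$; affinity-VAE then adds an affinity-matrix-driven similarity term on top (not sketched here). A minimal, illustrative computation of the $\beta$-weighted objective for a diagonal Gaussian posterior follows; the function name and arguments are assumptions for illustration.

```python
import math

def beta_vae_loss(recon_err, mu, logvar, beta):
    """beta-VAE objective: reconstruction error + beta * KL(q(z|x) || N(0, I)).

    mu and logvar parameterise a diagonal Gaussian posterior; the KL term
    uses the standard closed form for Gaussians.
    """
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                    for m, lv in zip(mu, logvar))
    return recon_err + beta * kl
```

With `beta > 1` the KL term is up-weighted, encouraging a more disentangled latent space; affinity-VAE's extra loss component additionally pulls morphologically similar objects together in that space.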
In this work, we consider the task of improving the accuracy of dynamic models for model predictive control (MPC) in an online setting. Even though prediction models can be learned and applied to model-based controllers, these models are often learned offline. In this offline setting, training data is first collected and a prediction model is learned through an elaborate training procedure. After the model is trained to the desired accuracy, it is deployed in a model predictive controller. However, since the model is learned offline, it does not adapt to disturbances or model errors observed during deployment. To improve the adaptiveness of the model and the controller, we propose an online dynamics learning framework that continually improves the accuracy of the dynamic model during deployment. We adopt knowledge-based neural ordinary differential equations (KNODE) as the dynamic model and use techniques inspired by transfer learning to continually improve its accuracy. We demonstrate the efficacy of the framework on a quadrotor and validate it in both simulations and physical experiments. The results show that the proposed approach is able to account for disturbances that are possibly time-varying, while maintaining good trajectory tracking performance.
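The hybrid-model idea behind this framework, a known nominal model plus a learned correction that is refined from data observed during deployment, can be illustrated with a scalar toy system. The sketch below is only a stand-in: KNODE uses a neural ODE for the learned component and a more careful training procedure, whereas here the correction is a single bias estimated from one-step prediction errors. All names are illustrative.

```python
class HybridModel:
    """Known nominal dynamics plus an online-estimated constant residual."""

    def __init__(self, f_known, lr=0.5):
        self.f_known = f_known  # nominal dynamics, f(x, u) -> dx/dt
        self.residual = 0.0     # online estimate of the unmodelled term
        self.lr = lr

    def predict(self, x, u, dt):
        # one-step Euler prediction using the current residual estimate
        return x + dt * (self.f_known(x, u) + self.residual)

    def update(self, x, u, x_next, dt):
        # nudge the residual toward the disturbance implied by the error
        err = x_next - self.predict(x, u, dt)
        self.residual += self.lr * err / dt
```

Fed a stream of observed transitions, the residual converges toward a constant unmodelled disturbance, which is the simplest instance of the adaptivity the abstract describes.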
Deep learning models have shown great effectiveness in identifying findings in medical images. However, they cannot handle the ever-changing clinical environment, which brings newly annotated medical data from different sources. To exploit incoming data streams, these models would benefit greatly from learning sequentially from new samples without forgetting previously acquired knowledge. In this paper, we introduce a benchmark for continual disease classification on the MedMNIST collection by applying existing state-of-the-art continual learning methods. In particular, we consider three continual learning scenarios: task-incremental and class-incremental learning, as well as the newly defined cross-domain incremental learning. Task- and class-incremental learning of diseases address the problem of classifying new samples without retraining the models from scratch, while cross-domain incremental learning addresses the problem of dealing with datasets originating from different institutions while retaining previously acquired knowledge. We perform a thorough analysis of the performance and examine how the well-known continual learning challenge of catastrophic forgetting manifests itself in these scenarios. The encouraging results demonstrate that continual learning has major potential to advance disease classification and to produce more robust and efficient learning frameworks for clinical settings. A code repository, data partitions, and baseline results for the complete benchmark will be made publicly available.
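In the class-incremental scenario mentioned above, the class set grows over time: each task introduces new disease classes while the model must keep recognising the old ones. A minimal sketch of how a labelled dataset can be partitioned into such a task stream is shown below; it is illustrative only and not the benchmark's actual partitioning code.

```python
def class_incremental_tasks(labels, classes_per_task):
    """Partition sample indices into tasks, each adding a new group of classes.

    Returns a list of (task_classes, sample_indices) pairs in the order the
    tasks would be presented to a continual learner.
    """
    classes = sorted(set(labels))
    tasks = []
    for start in range(0, len(classes), classes_per_task):
        task_classes = classes[start:start + classes_per_task]
        members = set(task_classes)
        indices = [i for i, y in enumerate(labels) if y in members]
        tasks.append((task_classes, indices))
    return tasks
```

Catastrophic forgetting is then measured by evaluating, after each task, how much accuracy on the earlier tasks' classes has degraded.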
Domain adaptation (DA) has recently raised strong interest in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of them have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality Domain Adaptation (crossMoDA) challenge was organised in conjunction with the 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021). CrossMoDA is the first large and multi-class benchmark for unsupervised cross-modality DA. The goal of the challenge is to segment two key brain structures involved in the follow-up and treatment planning of vestibular schwannoma (VS): the VS and the cochleas. Currently, the diagnosis and surveillance of VS patients are performed using contrast-enhanced T1 (ceT1) MRI. However, there is growing interest in using non-contrast sequences such as high-resolution T2 (hrT2) MRI. Therefore, we created an unsupervised cross-modality segmentation benchmark. The training set provides annotated ceT1 scans (N=105) and unpaired, non-annotated hrT2 scans (N=105). The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on the hrT2 scans provided in the test set (N=137). A total of 16 teams submitted algorithms for the evaluation phase. The level of performance reached by the top-performing teams is strikingly high (best median Dice - VS: 88.4%; cochleas: 85.7%) and close to full supervision (median Dice - VS: 92.5%; cochleas: 87.7%). All top-performing methods used an image-to-image translation approach to transform the source-domain images into pseudo-target-domain images. A segmentation network was then trained using these generated images and the manual annotations provided for the source images.
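The winning recipe, translating ceT1 images into pseudo-hrT2 images and then training a supervised segmenter on them, relies on learned image-to-image translation (e.g. CycleGAN-style networks). As a purely illustrative stand-in for that step, the crudest form of "translation" simply matches first- and second-order intensity statistics between domains; the function below is an assumption for illustration, not any team's method.

```python
def match_intensity_stats(source, target_mean, target_std):
    """Rescale source intensities so their mean/std match the target domain.

    A toy surrogate for cross-modality translation; real challenge entries
    used learned image-to-image translation networks instead.
    """
    n = len(source)
    mean = sum(source) / n
    std = (sum((v - mean) ** 2 for v in source) / n) ** 0.5 or 1.0
    return [(v - mean) / std * target_std + target_mean for v in source]
```

The appeal of the translate-then-segment design is that, once the images look like the target modality, the source-domain annotations can be reused unchanged to train a fully supervised segmentation network.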
A classical problem in computer vision is to infer, from a few available images, a 3D scene representation that can be used to render novel views at interactive rates. Previous work focuses on reconstructing pre-defined 3D representations, such as textured meshes, or implicit representations, such as radiance fields, and often requires input images with precise camera poses as well as long processing times for each novel scene. In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new region, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass. To compute the scene representation, we propose a generalization of the Vision Transformer to sets of images, enabling global information integration and thereby 3D reasoning. An efficient decoder transformer parameterizes the light field by attending into the scene representation to render novel views. Learning is supervised end-to-end by minimizing a novel-view reconstruction error. We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets, including a new dataset created for the paper. Furthermore, we demonstrate that SRT supports interactive visualization and semantic segmentation of real-world outdoor environments using Street View imagery.